29 research outputs found

SwissNLP : the Swiss association for natural language processing position paper at Computational Linguistics European nAtional and Regional Associations Meeting (CLEARA-MEET) at LREC

Author: Cieliebak Mark
Hürlimann Manuela
Vogel Manfred
Publication venue: ZHAW Zürcher Hochschule für Angewandte Wissenschaften
Publication date: 01/05/2020

Speech-to-text technology for hard-of-hearing people

Author: Cieliebak Mark
Galbier Jolanda
Hürlimann Manuela
Publication venue: European Research Consortium for Informatics and Mathematics
Publication date: 05/07/2022

Hard-of-hearing people face challenges in daily interactions that involve spoken language, such as meetings or doctor’s visits. Automatic speech recognition technology can support them by providing a written transcript of the conversation. Pro Audito Schweiz, the Swiss federation of hard-of-hearing people, and the Centre for Artificial Intelligence (CAI) at the Zurich University of Applied Sciences (ZHAW) conducted a preliminary study into the use of Speech-to-Text (STT) for this target group. Our survey among the members of Pro Audito found that there is large interest in using automated solutions for better understanding in everyday situations. We now propose to take the next step and develop an application which uses ZHAW’s high-quality STT models

ZHAW digitalcollection

Dialect Transfer for Swiss German Speech Translation

Author: Cieliebak Mark
Deriu Jan
Hürlimann Manuela
Paonessa Claudio
Schraner Yanick
Vogel Manfred
Publication date: 13/10/2023

This paper investigates the challenges in building Swiss German speech translation systems, specifically focusing on the impact of dialect diversity and differences between Swiss German and Standard German. Swiss German is a spoken language with no formal writing system; it comprises many diverse dialects and is a low-resource language with only around 5 million speakers. The study is guided by two key research questions: how does the inclusion and exclusion of dialects during the training of speech translation models for Swiss German impact the performance on specific dialects, and how do the differences between Swiss German and Standard German impact the performance of the systems? We show that dialect diversity and linguistic differences pose significant challenges to Swiss German speech translation, which is in line with linguistic hypotheses derived from empirical investigations

arXiv.org e-Print Archive

Overview of the GermEval 2020 shared task on Swiss German language identification

Author: Cieliebak Mark
Hürlimann Manuela
von Däniken Pius
Publication venue: CEUR Workshop Proceedings
Publication date: 01/06/2020

In this paper, we present the findings of the Shared Task on Swiss German Language Identification organised as part of the 7th edition of GermEval, co-located with SwissText and KONVENS 2020

ZHAW digitalcollection

Missing information, unresponsive authors, experimental flaws : the impossibility of assessing the reproducibility of previous human evaluations in NLP

Author: Belz Anya
Cieliebak Mark
Hürlimann Manuela
Reiter Ehud
Thomson Craig
Publication venue: Association for Computational Linguistics
Publication date: 01/01/2023

We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible. We present our results and findings, which include that just 13% of papers had (i) sufficiently low barriers to reproduction, and (ii) enough obtainable information, to be considered for reproduction, and that all but one of the experiments we selected for reproduction was discovered to have flaws that made the meaningfulness of conducting a reproduction questionable. As a result, we had to change our coordinated study design from a reproduce approach to a standardise-then-reproduce-twice approach. Our overall (negative) finding that the great majority of human evaluations in NLP is not repeatable and/or not reproducible and/or too flawed to justify reproduction, paints a dire picture, but presents an opportunity for a rethink about how to design and report human evaluations in NLP

ZHAW digitalcollection

Dialect transfer for Swiss German speech translation

Author: Cieliebak Mark
Deriu Jan Milan
Hürlimann Manuela
Paonessa Claudio
Schraner Yanick
Vogel Manfred
Publication venue: arXiv
Publication date: 13/10/2023

ZHAW digitalcollection

ZHAW-InIT : social media geolocation at VarDial 2020

Author: Benites de Azevedo e Souza Fernando
Cieliebak Mark
Hürlimann Manuela
von Däniken Pius
Publication venue: International Committee on Computational Linguistics (ICCL)
Publication date: 13/12/2020

We describe our approaches for the Social Media Geolocation (SMG) task at the VarDial Evaluation Campaign 2020. The goal was to predict geographical location (latitudes and longitudes) given an input text. There were three subtasks corresponding to German-speaking Switzerland (CH), Germany and Austria (DE-AT), and Croatia, Bosnia and Herzegovina, Montenegro and Serbia (BCMS). We submitted solutions to all subtasks but focused our development efforts on the CH subtask, where we achieved third place out of 16 submissions with a median distance of 15.93 km and had the best result of 14 unconstrained systems. In the DE-AT subtask, we ranked sixth out of ten submissions (fourth of 8 unconstrained systems) and for BCMS we achieved fourth place out of 13 submissions (second of 11 unconstrained systems)

ZHAW digitalcollection
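
The median-distance figure reported for the SMG task above can be illustrated with a minimal sketch. It assumes great-circle (haversine) distance on a spherical Earth; the abstract does not state which distance formula the campaign used, and the function names and the radius constant are hypothetical choices made for this illustration only.

```python
import math
from statistics import median

# Assumption: mean Earth radius in kilometres for a spherical-Earth approximation.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def median_distance_km(predicted, gold) -> float:
    """Median of the per-text distances between predicted and gold coordinates."""
    if len(predicted) != len(gold) or not predicted:
        raise ValueError("predicted and gold must be non-empty and of equal length")
    return median(
        haversine_km(p[0], p[1], g[0], g[1]) for p, g in zip(predicted, gold)
    )
```

Under this sketch, a system would be scored by passing its predicted (latitude, longitude) pairs and the gold pairs to `median_distance_km`; a lower value means predictions lie closer to the true locations.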

STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions

Author: Cieliebak Mark
Deriu Jan
Hartmann Julia
Hürlimann Manuela
Paonessa Claudio
Plüss Michel
Samardžić Tanja
Scheller Christian
Schmidt Larissa
Schraner Yanick
Vogel Manfred
Publication date: 30/05/2023

We present STT4SG-350 (Speech-to-Text for Swiss German), a corpus of Swiss German speech, annotated with Standard German text at the sentence level. The data is collected using a web app in which the speakers are shown Standard German sentences, which they translate to Swiss German and record. We make the corpus publicly available. It contains 343 hours of speech from all dialect regions and is the largest public speech corpus for Swiss German to date. Application areas include automatic speech recognition (ASR), text-to-speech, dialect identification, and speaker recognition. Dialect information, age group, and gender of the 316 speakers are provided. Genders are equally represented and the corpus includes speakers of all ages. Roughly the same amount of speech is provided per dialect region, which makes the corpus ideally suited for experiments with speech technology for different dialects. We provide training, validation, and test splits of the data. The test set consists of the same spoken sentences for each dialect region and allows a fair evaluation of the quality of speech technologies in different dialects. We train an ASR model on the training set and achieve an average BLEU score of 74.7 on the test set. The model beats the best published BLEU scores on 2 other Swiss German ASR test sets, demonstrating the quality of the corpus

arXiv.org e-Print Archive

CEASR : a corpus for evaluating automatic speech recognition

Author: Benites de Azevedo e Souza Fernando
Cieliebak Mark
Gedik Esin
Germann Fabian
Hürlimann Manuela
Ulasik Malgorzata Anna
Publication venue: European Language Resources Association
Publication date: 01/01/2020

In this paper, we present CEASR, a Corpus for Evaluating ASR quality. It is a data set derived from public speech corpora, containing manual transcripts enriched with metadata along with transcripts generated by several modern state-of-the-art ASR systems. CEASR provides this data in a unified structure, consistent across all corpora and systems, with normalised transcript texts and metadata. We then use CEASR to evaluate the quality of ASR systems on the basis of their Word Error Rate (WER). Our experiments show, among other results, a substantial difference in quality between commercial and open-source ASR tools, and differences of up to a factor of ten for single systems on different corpora. By using CEASR, we could obtain these results efficiently and easily. This shows that our corpus enables researchers to perform ASR-related evaluations and various in-depth analyses with noticeably reduced effort, without the need to collect, process and transcribe the speech data themselves

ZHAW digitalcollection
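
The Word Error Rate (WER) used in the CEASR evaluation above is conventionally the word-level edit distance between a reference transcript and a system transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the CEASR tooling itself; it assumes whitespace tokenisation and already-normalised text, and the function name is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference. Results also depend on the normalisation applied beforehand (casing, punctuation, number formats), which is why a corpus with consistently normalised transcripts makes cross-system comparison easier.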

ZHAW-InIT at GermEval 2020 task 4 : low-resource speech-to-text

Author: Benites de Azevedo e Souza Fernando
Büchi Matthias
Cieliebak Mark
Hürlimann Manuela
Ulasik Malgorzata Anna
von Däniken Pius
Publication venue: CEUR Workshop Proceedings
Publication date: 01/06/2020

This paper presents the contribution of ZHAW-InIT to Task 4 “Low-Resource STT” at GermEval 2020. The goal of the task is to develop a system for translating Swiss German dialect speech into Standard German text in the domain of parliamentary debates. Our approach is based on Jasper, a CNN Acoustic Model, which we fine-tune on the task data. We enhance the base system with an extended Language Model containing in-domain data and speed perturbation and run further experiments with post-processing. Our submission achieved first place with a final Word Error Rate of 40.29%

ZHAW digitalcollection